Speech Modelling Using Subspace and EM Techniques

Authors

  • Gavin Smith
  • João F. G. de Freitas
  • Tony Robinson
  • Mahesan Niranjan
Abstract

Tony Robinson, Cambridge University Engineering Department, Cambridge CB2 1PZ, England

Footnote 1: Work done while in Cambridge Engineering Dept., UK.

The speech waveform can be modelled as a piecewise-stationary linear stochastic state space system, and its parameters can be estimated using an expectation-maximisation (EM) algorithm. One problem is the initialisation of the EM algorithm: standard initialisation schemes can lead to poor formant trajectories, yet these trajectories are important for vowel intelligibility. The aim of this paper is to investigate the suitability of subspace identification methods for initialising EM. The paper compares the subspace state space system identification (4SID) method with the EM algorithm. The two methods are similar in that both first estimate a state sequence (4SID using Kalman filters, EM using Kalman smoothers) and then estimate parameters (4SID by least squares, EM by maximum likelihood). This similarity motivates the use of 4SID to initialise EM. In addition, 4SID is non-iterative and requires no initialisation, whereas EM is iterative and requires initialisation. However, 4SID is sub-optimal compared to EM in a probabilistic sense. In experiments on real speech, 4SID methods compare favourably with conventional initialisation techniques: they produce smoother formant trajectories, have greater frequency resolution, and produce higher likelihoods.
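The quantities mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual linear Gaussian state space form x[t+1] = A x[t] + w[t], y[t] = C x[t] + v[t], with noise covariances Q and R; the abstract does not give the paper's notation, so these symbols and the function names `kalman_loglik` and `formant_frequencies` are assumptions made for illustration. The first function evaluates the log-likelihood of a frame with a Kalman filter, which is the kind of score that could be used to compare initialisations. The second reads candidate formant frequencies off the complex eigenvalues of A, a standard pole interpretation rather than a procedure confirmed by the source.

```python
import numpy as np

def kalman_loglik(y, A, C, Q, R, x0, P0):
    """Log-likelihood of observations y (shape T x p) under the assumed model
    x[t+1] = A x[t] + w[t], w ~ N(0, Q);  y[t] = C x[t] + v[t], v ~ N(0, R).
    x0 and P0 are the prior mean and covariance of the first state."""
    x, P = x0.copy(), P0.copy()
    loglik = 0.0
    for yt in y:
        # Innovation (one-step prediction error) and its covariance
        e = yt - C @ x
        S = C @ P @ C.T + R
        # Gaussian log-density of the innovation
        loglik += -0.5 * (e @ np.linalg.solve(S, e)
                          + np.linalg.slogdet(S)[1]
                          + len(yt) * np.log(2.0 * np.pi))
        # Measurement update
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ e
        P = P - K @ C @ P
        # Time update (prediction for the next sample)
        x = A @ x
        P = A @ P @ A.T + Q
    return loglik

def formant_frequencies(A, fs):
    """Frequencies in Hz of the eigenvalues of A with positive imaginary part,
    for sampling rate fs. Each complex-conjugate pair gives one resonance."""
    eig = np.linalg.eigvals(A)
    eig = eig[eig.imag > 1e-9]
    return np.sort(np.angle(eig) * fs / (2.0 * np.pi))
```

Under these assumptions, any initial estimate of (A, C, Q, R) for a frame, whether from a subspace method or a conventional scheme, could be scored with `kalman_loglik` before and after EM, and its resonances inspected with `formant_frequencies`.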

Similar resources

Speech Enhancement Through an Optimized Subspace Division Technique

The speech enhancement techniques are often employed to improve the quality and intelligibility of the noisy speech signals. This paper discusses a novel technique for speech enhancement which is based on Singular Value Decomposition. This implementation utilizes a Genetic Algorithm based optimization method for reducing the effects of environmental noises from the singular vectors as well as t...

Constrained Subspace Modelling

When performing subspace modelling of data using Principal Component Analysis (PCA) it may be desirable to constrain certain directions to be more meaningful in the context of the problem being investigated. This need arises due to the data often being approximately isotropic along the lesser principal components, making the choice of directions for these components more-or-less arbitrary. Furt...

Modelling Decision Problems Via Birkhoff Polyhedra

A compact formulation of the set of tours neither in a graph nor its complement is presented and illustrates a general methodology proposed for constructing polyhedral models of decision problems based upon permutations, projection and lifting techniques. Directed Hamilton tours on n vertex graphs are interpreted as (n-1)- permutations. Sets of extrema of Birkhoff polyhedra are mapped to tours ...

A signal subspace approach for speech modelling and classification

In this paper, a speech classifier inspired by the signal subspace approach is developed. A novel signal subspace speech model is initially obtained via a rank reducing subspace decomposition algorithm that is based on the SVD. Motivated by the assumption that the speech signal comprises of short term dynamics that are slowly changing, it follows that the signal subspace of the speech signal is...

Publication date: 1999